Identification of clinically relevant patient endotypes in traumatic brain injury using latent class analysis

Traumatic brain injury (TBI) is a complex condition where heterogeneity impedes the advancement of care. Understanding the diverse presentations of TBI is crucial for personalized medicine. Our study aimed to identify clinically relevant patient endotypes in TBI using latent class analysis based on comorbidity data. We used the Medical Information Mart for Intensive Care III database, which includes 2,629 adult TBI patients. We identified five stable endotypes characterized by specific comorbidity profiles: Heart Failure and Arrhythmia, Healthy, Renal Failure with Hypertension, Alcohol Abuse, and Hypertension. Each endotype had distinct clinical characteristics and outcomes: The Heart Failure and Arrhythmia endotype had lower survival rates than the Renal Failure with Hypertension despite featuring fewer comorbidities overall. Patients in the Hypertension endotype had higher rates of neurosurgical intervention but shorter stays in contrast to the Alcohol Abuse endotype which had lower rates of neurosurgical intervention but significantly longer hospital stays. Both endotypes had high overall survival rates comparable to the Healthy endotype. Logistic regression models showed that endotypes improved the predictability of survival compared to individual comorbidities alone. This study validates clinical endotypes as an approach to addressing heterogeneity in TBI and demonstrates the potential of this methodology in other complex conditions.


Definition of comorbidities
We used the well-established Elixhauser comorbidity index 31 to define chronic conditions by implementing an algorithm provided by the authors of the MIMIC-III database 32 .This algorithm groups ICD-9 codes into 30 chronic conditions to define comorbidity categories.The output is a binary matrix reflecting the presence or absence of a morbidity category by patient ID which was used in subsequent analyses.

Network discovery
We implemented network discovery to identify co-occurrences in morbidities as described previously 11 .Relative risk (RR) of observing a pair of comorbidities affecting the same patient was calculated using the formula described by Hidalgo et al. 33 .The RR is calculated from the number of patients affected by the two comorbidities, study population, and prevalence of each comorbidity (Supplementary Fig. 1).We then constructed a network of associated comorbidities by only including associations over the significance threshold of p < 0.05 to visualize the relationships between various conditions.Within this network graph, each node represents an Elixhauser comorbidity with node size corresponding to its prevalence and each edge between nodes representing an association between comorbidities with the color corresponding to the RR of the pair and the width to the number of co-occurrences.The colormap for RRs was capped at 5.0 to prevent outliers from skewing visualization.

Latent class analysis and cluster stability
Latent class analysis (LCA) uses structural equation modeling to identify different subgroups (endotypes) within study populations that share certain characteristics.The R package "poLCA" 34 was used to identify patient endotypes based on the derived 30 Elixhauser comorbidities.The LCA algorithm fits mixture models to the input data and, through optimization steps, determines the latent classes (endotypes) based on the best-fitting model.For optimization, we used previously published guidance 35,36 , including the elbow method and Akaike information criteria (AIC) to find the optimal number of endotypes.This method involves finding the inflection point of the graph and plotting the number of clusters against an evaluation metric (AIC in this case) which can be seen in Supplementary Fig. 2. We used the R package "inflection" 37 to objectively determine the inflection point to be at five clusters which represents the optimal number of endotypes.
Due to the stochastic nature of implementing LCA, we took the following additional measures to ensure model validity and cluster stability.We repeated the LCA clustering algorithm on the same data to generate 30 sets of five endotypes.The comorbidity distributions of all 150 total endotypes were then analyzed using latent profile analysis (LPA) which, like LCA, looks for clusters, but in continuous variables rather than categorical.This was done via the R package "mClust" 38 and compressed the 150 endotypes into a smaller number of recurring endotypes which are visualized in Supplementary Fig. 3.We found that the most well-fit models (models with the lowest AIC) had statistically identical endotype comorbidity profiles.Therefore, we decided to use a representative model within these low-AIC models for our subsequent analysis as this model was inferred to

Quantitative analysis/statistics and visualization
Absolute counts with percentages were reported for categorical data and mean with standard deviation (SD) for continuous data.Statistics were completed using R packages "stats" 39 and "multcomp" 40 and Python libraries "statsmodels" 41 and "SciPy" 42 .The primary outcome was defined as survival to discharge and the secondary outcomes as hospital length of stay (LOS), and neurosurgical intervention.
The Pearson Correlation analysis was conducted to assess the collinearity between age and the number of comorbidities as these were the most likely variables to demonstrate correlation.Only a weak correlation between age and number of comorbidities was found so we used multivariable logistic regression to analyze the relationship between our primary outcome of survival to discharge and number of comorbidities while accounting for key prognostic factors: age, sex, and GCS.
Pairwise chi-squared (survival and intervention rates) and t-tests (LOS) were used for the initial comparisons of clinical outcomes across endotypes and P-values were adjusted using the Holm-Bonferroni method.Multivariable logistic regression and Analysis of Covariance (ANCOVA) was used for more in-depth comparisons of clinical outcomes across endotypes: Multivariable logistic regression analysis was used to compare survival to discharge (yes/no) and neurological intervention (yes/no) between comorbidity endotypes (using the "Healthy" (HE) endotype as the reference category) adjusting for age, sex, and GCS.The odds ratios (OR) with 95% confidence intervals (CI) were reported where appropriate.ANCOVA was used to compare hospital LOS between comorbidity endotypes while adjusting for age, sex, and GCS with a Tukey post-hoc for comparisons between each pair combination of endotypes.
To obtain additional analysis resolution for our primary outcome of survival, we stratified patients within endotypes into groups based on age and GCS bins and conducted paired t-tests between the strata of the endotypes.
Additionally, we created the following multivariable logistic regression models to isolate and assess the performance of endotypes in predicting our primary outcome of survival: (1) model 1 using only the 30 Elixhauser comorbidity categories, (2) model 2 using only the 5 derived comorbidity endotypes and, (3) model 3 using both the 30 Elixhauser comorbidity categories and 5 derived comorbidity endotypes.These models were compared using ANOVA with F scores and P-values reported.Area under the curve (AUC) was also calculated for each model using the R package "pROC" 43 .The above analysis was repeated after adding age, GCS, and sex as covariates to models to assess the effect of Elixhauser comorbidities and endotypes on overall survival prediction performance.Logistic regression models and AUC calculations were also completed for a base model that included only age, GCS, and sex as well as models using only these individual variables.Cases with missing data were excluded from statistical analyses.Figures were generated via the R library "ggplot2" and the Python libraries "matplotlib" and "networkX".

Comorbidity characterization
We included 2,629 patients in our analysis.The mean age was 59.0, 61.9% were male, average GCS was 10.4, the mean number of comorbidities was 2.0, and the rate of survival to discharge was 36.7%.The demographics of our study cohort are summarized in Table 1.
The relationship between age, number of comorbidities, and survival can be visualized in Fig. 1.Age and comorbidity count was only weakly associated (r = 0.20, p < 0.001).We then tested the association of the comorbidity count with the survival to discharge after accounting for age and GCS score in a multivariable logistic regression model and found that the number of comorbidities was negatively associated with survival (OR for survival per one increase in comorbidity count = 0.984, CI: 0.975-0.994).
We next investigated the pattern of comorbidities, specifically their associations in the entire cohort to test our hypothesis that many comorbidities co-occur.These relationships between the Elixhauser comorbidity groups are visualized by the network graph in Fig. 2.This network discovery of the data revealed intuitive

Clinical characterization of comorbidity endotypes
We next analyzed the comorbidity data for the presence of endotypes using LCA.After 30 iterations, five stable clusters of patient endotypes emerged, detailed in Fig. 3 by their comorbidity probability distribution.Endotype 1 was labeled Heart Failure and Arrhythmia (HFA), characterized by cardiac comorbidities.Endotype 2 featured few comorbidities and was labelled Healthy (HE) and used as the reference endotype for subsequent analyses.Endotype 3 is characterized mainly by high rates of renal failure with complicated hypertension and was labelled Renal Failure with Hypertension (RFH).Endotype 4 was labelled Alcohol Abuse (AA), featuring high rates of alcohol abuse and relatively low rates of other comorbidities.Endotype 5 had high rates of hypertension and relatively low rates of other comorbidities, labelled Hypertension (HTN).The distribution of TBI injury severity categories was relatively similar although HE and AA endotypes contained slightly larger proportions of severe TBI at 46.9% and 39.4%, respectively (Table 2).The HFA and RFH endotypes had nearly identical age distributions of mostly older individuals in addition to a small subpopulation of middle-aged individuals (17.9% middle-aged, 82.4% old vs 19.7% middle-aged, 80.9% old, respectively).The HTN endotype was composed mainly of older adults (68.6%) with a moderate proportion of middle-aged individuals (28.3%) and a small subpopulation of young individuals (3.7%).The AA group was primarily composed of middle-aged individuals (73.7%).HE endotype was composed of mainly young (47.6%) and middle-aged (40.0%) individuals.There were minor sex differences among a few endotypes, contributed to mainly by the young age segment which consistently had a higher proportion of male patients; middle-aged and old age groups across endotypes had comparable sex distributions (Supplementary Table 2).For comorbidity count, HE had the least number of comorbidities (0.8 ± 0.8) followed by HTN (2.5 ± 1.2) while the other comorbidities had higher averages within a similar range (AA: 4.0 ± 1.2, HFA: 4.5 ± 1.6, RFH: 5.5 ± 2.1).

Survival to discharge
Multivariable logistic regression models accounting for age, sex, and GCS showed that the endotypes HE, AA, and HTN had the highest survival rates (Table 4).Using the HE endotype as reference, AA (OR = 0.999, CI: 0.935-1.066)and HTN (OR = 0.962, CI: 0.919-1.007)had statistically similar survival rates compared to HE. HFA had the lowest survival rate (OR with respect to HE = 0.803, CI: 0.747-0.862)which was statistically different from all other endotypes.RFH had lower survival than HE (OR with respect to HE = 0.889, CI: 0.821-0.964)but higher than HFA (OR with respect to HFA = 1.108,CI: 1.011-1.214).Survival rates when accounting for age, sex, and GCS are as such: HE, AA, HTN > RFH > HFA.In contrast to the initial analyses, comparison of survival rates between HE, AA, and HTN becomes statistically insignificant when including age (OR per increase in age = 0.991, CI: 0.991-0.992)and GCS (OR per increase in GCS score = 1.028,CI: 1.024-1.031).Additional investigation of these three endotypes based on stratification by age and GCS revealed that HE had higher survival in the middle-aged group compared to AA (80.3% vs 69.8%, p = 0.010) and was similar in all age groups in comparison to HTN (Table 5).However, when further sub-stratifying age groups by GCS, some substrata comparisons become significant: HE has a higher survival rate than AA for the mild GCS group in both young (97.5% vs 83.3%, p = 0.043) and middle-aged groups (94.1% vs 70.0%, p < 0.001).Although survival was not different across age strata between HE and HTN, sub-stratification by GCS revealed that HE had better survival for the mild (94.1% vs 87.2%, p = 0.043) and severe (66.7% vs 52.6%, p = 0.043) GCS groups within the middle-aged group.Radial plots representing probabilities for each Elixhauser comorbidity within a particular endotype.A higher probability for an Elixhauser comorbidity for a given endotype means that a patient with this comorbidity has a higher chance of being classified in the given endotype.For example, a patient with only uncomplicated hypertension will be classified into Endotype 5 as it has the highest probability score for uncomplicated hypertension, higher than Endotype 1.However, if the patient also has congestive heart failure or arrhythmia, its score for Endotype 1 would be higher than for Endotype 5, hence resulting in them being categorized as such.Overall, these comorbidity distributions can be thought of as the probability of the prototypical patient in an endotype to have a specific comorbidity.Although similar, these values are not the same as the prevalence rates for comorbidities in an endotype.From these endotype plots, we can see common clinically distinct patient subtypes based on their presenting comorbidities.Based on these distributions, Endotype 1 was labeled Heart Failure and Arrhythmia (HFA), Endotype 2 labelled Healthy (HE), Endotype 3 labelled Renal Failure with Hypertension (RFH), Endotype 4 was labelled Alcohol Abuse (AA) and Endotype 5 labelled Hypertension (HTN).

Predictive value of comorbidities and endotypes for survival
Logistic regression using all 30 Elixhauser comorbidity categories to predict survival to discharge accounts for ~ 15.3% of the variation in outcome (F(30, 2598) = 15.59,R 2 = 0.153, p < 0.001) and had an AUC of 0.73.A simpler logistic regression using the five derived comorbidity endotypes accounted for 11.9% of the variation in survival (F(4, 2624) = 88.61,R 2 = 0.119, p < 0.001) with an AUC of 0.69.Combining the base 30 Elixhauser categories with the endotypes in a full logistic regression model explains 17.0% of the variation in survival (F(34, 2594) = 15.57,R 2 = 0.170, p < 0.001) and has an AUC of 0.74.The decrease in model error for predicting survival to discharge seen between the base logistic regression (including only the comorbidities) and the full model (including both comorbidities and endotypes) is statistically significant (F(4, 2594) = 13.23,p < 0.001).
With the inclusion of age, GCS, and sex into the logistic model using the 30 Elixhauser comorbidities, the model explains ~ 30.5% of the variation in survival (F(33, 2548) = 35.28,R 2 = 0.305, p < 0.001) and had an AUC of 0.83.The addition of the five derived endotypes into this model results in explanation of ~ 31.7% of the variation in survival (F(37, 2544) = 31.91,R 2 = 0.317, p < 0.001) with an AUC of 0.84.The decrease in model error in predicting survival after adding endotypes is statistically significant (F(4, 2544) = 3.13, p = 0.014).A model with age, GCS, sex, and comorbidity endotypes explains ~ 28.7% of survival variation (F(7, 2574) = 142.9,R 2 = 0.287, p < 0.001) with an AUC of 0.82.For reference, a logistic model with only age, GCS, and sex accounts for 26.8% of the variation in survival with an AUC of 0.80.A model with only age accounts for 19.7% of survival variation with an AUC of 0.77; a model with only GCS accounts for 2.7% of survival variation with an AUC of 0.61; a model with only sex accounts for 0.9% of survival variation with an AUC of 0.55.

Table 5.
Stratified endotype survival outcomes.Comparison of survival after stratification by both injury severity and for Healthy (HE), Alcohol abuse (AA), and Hypertension (HTN) endotypes.Only HE, AA, and HTN were chosen for additional stratified analysis as results from regression with adjustment by age and Glasgow coma scale (GCS) differed from initial results and due to significant variations of GCS and age distributions between these three endotypes.Young: 16-39, Middle-aged: 40-69, Old: 70 + .Mild: GCS 14-15, Moderate: GCS 9-13, Severe: GCS 3-8.

Discussion
Endotypes in TBI and other conditions Our initial analyses are consistent with previous literature and clinical experience-that comorbidities are associated with clinical outcomes and that comorbidities tend to occur in patterns 44,45 .
In characterizing the relationship between comorbidities and clinical outcomes, we have identified five patient endotypes using LCA, based solely on the presence of medical conditions.These represent the most observed clinical patterns within the TBI population of the MIMIC III dataset.The notion of patient endotypes is a validated approach to precision medicine and has been used to elucidate heterogeneity in many other conditions such as sepsis and asthma 14,16,46 .In TBI, Åkerlund et al. suggested six endotypes in TBI patients based on GCS, body temperature, and lab values that added predictive value for in-hospital mortality 47 .This is the first study to consider endotypes in TBI patients utilizing comorbidities.
Our previous work 11 investigated comorbidity endotypes within all of MIMIC III, identifying six overall endotypes while this current study focuses on the TBI population and produces similar but distinct endotypes.Both studies' endotypes included hypertension, cardiac comorbidities, renal failure, and the lack of comorbidities (i.e., healthy).However, there were also major differences.While the AA endotype contained only alcohol use disorder as a comorbidity, the corresponding endotype in our previous analysis was "Hepatic addiction", which, in addition to alcohol abuse, had significant rates of liver failure and coagulopathy.As well, the current analysis did not find any endotype that corresponded to the "Cardiopulmonary" endotype in our previous study, characterized by high rates of chronic pulmonary and cardiac comorbidities.Although similarities exist, these results suggest that distinct TBI comorbidity endotypes exist that differ from those of the general ICU population.

Viewing TBI clinical outcomes by endotypes
Analyzing comorbidities via endotypes rather than individually shifts emphasis onto the fact that individual comorbidities don't exist in isolation and allows for a more comprehensive understanding of how a group of conditions affects the outcomes of acute injuries such as TBI.Additionally, it allows us to group patient presentations into clusters based on common pathophysiological patterns and clinical considerations which may facilitate decision-making.

Outcomes differences between medically complex endotypes
Heart failure comorbid with arrhythmia (HFA) and renal failure comorbid with hypertension (RFH) appear to be two common patterns of comorbidity in medically complex TBI patients.Overall, HFA has higher rates of neurosurgical intervention and mortality but similar LOS.
The association between renal failure and cardiac comorbidities is supported by many studies [48][49][50][51] and may explain notable rates of congestive heart failure (CHF) and arrhythmia in the RFH comorbidity distribution.However, it is interesting that despite the notable presence of cardiac comorbidities in RFH, the pure cardiac endotype HFA, which has fewer comorbidities overall, still exhibited poorer outcomes.A large multicenter study by Shibahashi et al. 52 found that both CHF and chronic kidney disease (CKD) increased the odds of in-hospital mortality for TBI patients.The odds ratio estimate for CKD (2.76, CI: 1.96-3.89)was notably higher than CHF (1.82, CI: 1.31-2.51),contrasting our results.However, Shibahashi et al. compared individual comorbidities while our analysis compares endotypes that contain multiple comorbidities at different rates.A possible explanation for the difference in our findings could be the notable prevalence of arrhythmias in the HFA and RFH endotypes.Patients with arrhythmias are frequently administered anticoagulant and/or antiplatelet medications, which may increase the need for intervention, risk of intervention, and mortality rates 53,54 .Hence, HFA having a higher rate of intervention and mortality may be contributed to by its higher rate of arrhythmia compared to the RFH endotype.Overall, our results suggest that the TBI patients of the cardiac-centric HFA endotype carry a poorer prognosis than the renal-centric RFH endotype.Both endotypes represent cohorts of older, clinically complex patients that fare significantly worse than the other endotypes even after controlling for age or injury severity.Future trials of TBI interventions may benefit from treating patients that fit into these endotypes as their own subgroups.

Alcohol abuse and length of stay
As alcohol abuse and hypertension are common in TBI patients 22 , we were interested in how the AA and HTN endotypes differed from HE.There is a robust relationship between alcohol use and TBI 55 that is consistent with our finding that AA is a major TBI endotype.Overall, HE and AA had similar rates of survival and neurosurgical intervention but notably higher LOS.There have been many factors linking alcohol use disorder and increased LOS such as alcohol withdrawal including seizure [56][57][58] , increased risk of hospital complications and infection, particularly of pneumonia 59 , and pre-injury cerebral atrophy secondary to neurotoxicity 60,61 .Taken together, a slow and potentially turbulent recovery may explain the increased LOS in AA.Overall, we see that the AA endotype identified via LCA is consistent with previous literature and aligns with clinical observations.

Hypertension and outcomes of intervention and survival
We found that the HTN endotype has a higher rate of intervention compared to HE and a lower rate of survival, isolated to the middle-aged strata, particularly among mild and severe injury severities.Given that chronic hypertension is known to cause and accelerate cerebrovascular damage 62 , higher rates of adverse outcomes after TBI in the HTN endotype could be explained by a less compliant, and hence less resilient, remodelled cerebrovascular system 63,64 .The same mild injuries to HTN patients may result in greater damage and hence warrant a more aggressive treatment (e.g., ICP monitoring, craniotomy).The observation that differences in survival were only significant for the middle-aged group and not for the old age group may be due to the effect of age overshadowing the weaker effect of hypertension in this age group.While pre-hospital hypertension has been linked to increased TBI mortality 65,66 , a systematic review did not find chronic hypertension to increase in-hospital mortality in TBI patients 22 .These differences may result from other studies using the presence/absence of hypertension for their effect analysis in patients that potentially have other pre-existing comorbidities, contrasting our analysis that directly compares healthy patients with patients primarily with hypertension (without other comorbidities).

Endotypes for precision medicine
LCA endotyping assesses the patient more holistically, accounting for both the presence and absence of diseases, to make comparisons between groups of patients that are similar.Linear regression attempts to describe the net effect of a single condition (e.g., hypertension) across a potentially heterogeneous population and struggles when input variables are correlated with each other, as is the case with comorbidities 67,68 .It also struggles to describe outcomes that cannot be explained by a linear combination of the model inputs whereas LCA endotypes have an aspect of non-linearity inherent to the complex relationships between input variables that they capture 69,70 .Our observed improvement in a logistic regression model's ability to predict survival by incorporating endotypes, in addition to individual comorbidities, supports the capturing of such additional non-linear effects through LCA.A large proportion of the endotypes' predictive power for survival likely comes from their capture of patient age (e.g., patients with hypertension are likely to be older than those with alcohol abuse).Such is suggested by the large amount of variation in survival explained by age alone (19.7%) combined with the minor improvement in performance seen by adding endotypes as a variable to the base model of only age, GCS, and sex (R 2 : 26.8% to 28.0%; AUC: 0.80 to 0.82).Interestingly, our comorbidity endotypes have more predictive value as an isolated variable for survival than GCS (R 2 : 11.9%vs 2.7%, AUC: 0.69 vs 0.61) suggesting that patient-inherent factors may be more important to TBI outcomes than initial injury severity as measured by GCS.
As our endotypes involved only broad comorbidity variables, the overall improved predictability offered by endotypes in our analysis was relatively modest (~ 2% in model fit by R 2 , ~ 2% in model AUC).This improvement may be larger for complex endotypes that integrate continuous variables, especially those exhibiting a parabolic association with outcomes such as blood pressure, hemoglobin, and temperature.Therefore, endotype analysis offers a different perspective on the data that may be better suited to precision medicine and prognostication than regression with individual variables.

Limitations
LCA is a method that attempts to deduce a possible explanation given the observed data points.A common concern with LCA is that results invite interpretation despite the lack of a true reasonable interpretation in reality 23,71 .Therefore, it relies on the researcher's judgement to determine if the latent classes represent meaningful entities which introduce a degree of interpretive bias into the analysis.Through our cluster selection process, we have attempted to ensure that the final set of clusters represented the simplest clinical explanation, but we cannot exclude the existence of a potentially more complex set of endotypes that better explain the variance in TBI patient comorbidities.
While the LCA endotypes help explain the variation of comorbidities across patients, there also exists more layers of variation within each identified endotype, as not all patients within endotypes are identical.We try to account for this in the analysis of our primary outcome of survival with the use of stratification by GCS and age to capture important variations common to all TBI patients.However, it could be argued that GCS and age have differential effects on outcomes depending on endotype and that there may be additional condition-specific factors to consider for each endotype.For example, creatinine measures may be relevant in RFH to account for the extent of renal failure while total cumulative alcohol intake could be used to stratify AA patients into different exposure levels.Unfortunately, due to limited granularity, and potentially limited sample size in substrata, we did not investigate this further.This approach of granular analysis could be a strategy for future research toward increasingly personalized medicine using big data.
Another limitation of this study is the size of the MIMIC III dataset.Although it is a large and comprehensive dataset, there is limited granularity that may not capture the full complexity of patient health or outcomes.We do not have clinical justifications behind decisions, such as neurosurgical intervention, which requires the context of individual patient risk assessment.The data is limited to a single center, which may not be generalizable to the overall population and may have policies and personnel preferences that introduce artifacts in variable coding.In particular, there may be inaccuracies and staff biases in the ICD9 codes that we have used for the implementation of Elixhauser comorbidities.The same analysis using another database may yield a set of comorbidity endotypes that differ in significant ways.Finally, limitations inherent to retrospective designs apply as well.Therefore, the next steps for our research will involve the replication of our methods on different TBI databases, and ultimately prospective studies, to assess the generalizability of our findings.

Conclusion
TBI, like many other conditions, is complex with varied clinical and biological presentations that contribute to their heterogeneity.In this study, we demonstrate the discovery of TBI comorbidity endotypes with distinct clinical outcomes.Our results suggest that viewing clinical variables, such as the presence or absence of a disease, in combinations is a promising way to address and better understand the heterogeneity observed in TBI and other complex conditions.This study serves to expand the applicability of the growing concept of clinical endotypes by validating the approach in a highly prevalent condition such as TBI.By utilizing approaches such as LCA that can extract meaningful combinations from large volumes of data, we can move towards precision medicine in an era that is becoming increasingly data-rich.

Figure 1 .
Figure1.Relationship between age, number of comorbidities, and mortality.Jitter plot of age, number of comorbidities, and survival to discharge across TBI cohort.Both age and number of comorbidities are linearly correlated to survival.Younger patients and patients with lower number of comorbidities tend to have better survival, suggested by the higher proportion of expired patients moving away from the origin of the plot.Age is also correlated with comorbidities as seen by the higher concentration of younger patients at lower numbers of comorbidities and the gradual shift towards older patients with increasing number of comorbidities.

Figure 2 .
Figure 2. Association between Elixhauser comorbidities.Network graph of Elixhauser comorbidity associations.Each node represents 1 of 30 Elixhauser comorbidity groups.Node size represents prevalence, edge width represents number of patients with specific co-occurrence, edge color represents strength of association in terms of relative risk (RR) calculated as described in the methodology.Many clinically intuitive associations emerge such as congestive heart failure with arrhythmia and valvular diseases, alcohol abuse with drug abuse, and renal failure with hypertension.Only associations with p < 0.05 and RR > 1.5 are shown to reduce cluttering from weaker associations.RR is capped at 5 for visualization purposes (i.e., RRs > 5 were shown only as 5) to prevent very strong associations from skewing the upper range of the color bar.

Figure 3 .
Figure 3. Comorbidity distribution of endotypes.Radial plots representing probabilities for each Elixhauser comorbidity within a particular endotype.A higher probability for an Elixhauser comorbidity for a given endotype means that a patient with this comorbidity has a higher chance of being classified in the given endotype.For example, a patient with only uncomplicated hypertension will be classified into Endotype 5 as it has the highest probability score for uncomplicated hypertension, higher than Endotype 1.However, if the patient also has congestive heart failure or arrhythmia, its score for Endotype 1 would be higher than for Endotype 5, hence resulting in them being categorized as such.Overall, these comorbidity distributions can be thought of as the probability of the prototypical patient in an endotype to have a specific comorbidity.Although similar, these values are not the same as the prevalence rates for comorbidities in an endotype.From these endotype plots, we can see common clinically distinct patient subtypes based on their presenting comorbidities.Based on these distributions, Endotype 1 was labeled Heart Failure and Arrhythmia (HFA), Endotype 2 labelled Healthy (HE), Endotype 3 labelled Renal Failure with Hypertension (RFH), Endotype 4 was labelled Alcohol Abuse (AA) and Endotype 5 labelled Hypertension (HTN).

Table 1 .
Study demographics of 2,629 included patients.Parameters of age, sex, Glasgow coma scale (GCS), number of comorbidities, survival rate, intervention rate, and hospital length of stay for the traumatic brain injury cohort.Continuous variables reported as mean ± SD.Categorical variables reported as count with percentage of population.

Table 3 .
Outcomes of endotypes.Outcomes of survival rate, neurosurgical intervention rate, and length of hospital stay (LOS) across the five stable comorbidity endotypes without any stratification.HFA: Heart failure and arrhythmia HE Healthy, RFH Renal failure with hypertension, AA Alcohol abuse, HTN Hypertension.

Table 4 .
Adjusted comparison of endotype survival outcomes.Comparison of survival rates across the five stable comorbidity endotypes after adjustment by age, sex, and Glasgow coma scale (GCS) using multivariate logistic regression.The base regression model uses the HE endotype as the reference for the endotype variable in survival odds ratio calculations.Models using the other endotypes as reference are also reported for comparison.Odds ratio for age, sex, and GCS are omitted in these models as they remained constant for all models.HFA: Heart failure and arrhythmia, HE Healthy, RFH Renal failure with hypertension, AA Alcohol abuse, HTN Hypertension.Significance codes: "***": ≤ 0.001, "**": ≤ 0.01, "*": ≤ 0.05.

Table 6 .
Adjusted comparison of endotype intervention rates.Comparison of rates of neurosurgical intervention across the five stable comorbidity endotypes after adjustment by age, sex, and Glasgow coma scale (GCS) using multivariate logistic regression.The base regression model uses the Healthy (HE) endotype as the reference for the endotype variable in intervention rate odds ratio calculations.Models using the other endotypes as reference are also reported for comparison.Odds ratio for age, sex, and GCS are omitted in these models as they remained constant for all models.HFA Heart failure and arrhythmia, RFH Renal failure with hypertension, AA Alcohol abuse, HTN Hypertension.Significance codes: "***": ≤ 0.001, "**": ≤ 0.01, "*": ≤ 0.05.